
    Deep Learning for Semantic Part Segmentation with High-Level Guidance

    In this work we address the task of segmenting an object into its parts, or semantic part segmentation. We start by adapting a state-of-the-art semantic segmentation system to this task, and show that a fully-convolutional deep CNN coupled with Dense CRF labelling provides excellent results for a broad range of object categories. Still, this approach remains agnostic to high-level constraints between object parts. We introduce such prior information by means of a Restricted Boltzmann Machine, adapted to our task and trained in a discriminative fashion as a hidden CRF, demonstrating that prior information can yield additional improvements. We also investigate the performance of our approach "in the wild", without information concerning the objects' bounding boxes, using an object detector to guide a multi-scale segmentation scheme. We evaluate our approach on the Penn-Fudan and LFW datasets for the tasks of pedestrian parsing and face labelling respectively. We show superior performance with respect to competitive methods that have been extensively engineered on these benchmarks, as well as realistic qualitative results on part segmentation, even for occluded or deformable objects. We also provide quantitative and extensive qualitative results on three classes from the PASCAL Parts dataset. Finally, we show that our multi-scale segmentation scheme can boost accuracy, recovering segmentations for finer parts.

    Comment: 11 pages (including references), 3 figures, 2 tables
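
    The CNN + Dense CRF pipeline described above can be sketched as follows. This is a minimal illustration assuming the common pydensecrf wrapper around Krähenbühl and Koltun's fully-connected CRF, with illustrative kernel parameters; it is not the authors' implementation.

```python
# Dense CRF refinement of per-pixel part probabilities produced by a CNN.
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def refine_part_labels(image, probs, n_iters=5):
    """image: (H, W, 3) uint8; probs: (C, H, W) softmax output of the CNN."""
    C, H, W = probs.shape
    d = dcrf.DenseCRF2D(W, H, C)
    d.setUnaryEnergy(unary_from_softmax(probs))      # unaries = -log p(label)
    d.addPairwiseGaussian(sxy=3, compat=3)           # location-only smoothness kernel
    d.addPairwiseBilateral(sxy=60, srgb=10,          # appearance kernel: similar nearby
                           rgbim=np.ascontiguousarray(image), compat=5)
    Q = np.array(d.inference(n_iters))               # mean-field inference, (C, H*W)
    return Q.argmax(axis=0).reshape(H, W)            # per-pixel part labels
```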

    HOP: Hierarchical object parsing


    HoloPose: Holistic 3D Human Reconstruction In-The-Wild.

    We introduce HoloPose, a method for holistic monocular 3D human body reconstruction. We first introduce a part-based model for 3D model parameter regression that allows our method to operate in-the-wild, gracefully handling severe occlusions and large pose variation. We further train a multi-task network comprising 2D, 3D and DensePose estimation to drive the 3D reconstruction task. For this we introduce an iterative refinement method that aligns the model-based estimates of 2D/3D joint positions and DensePose with their image-based counterparts delivered by CNNs, achieving both model-based global consistency and high spatial accuracy thanks to the bottom-up CNN processing. We validate our contributions on challenging benchmarks, showing that our method yields both accurate joint and 3D surface estimates while operating at more than 10 fps in-the-wild. More information about our approach, including videos and demos, is available at http://arielai.com/holopose
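
    The iterative refinement step can be pictured as a small optimization loop that pulls the model-based joints towards the CNN's bottom-up estimates. A minimal sketch, assuming a differentiable parametric body model; `body_model` and `project` are hypothetical placeholders, and the DensePose alignment term is omitted for brevity.

```python
# Gradient-based alignment of model parameters to CNN keypoint estimates.
import torch

def refine(theta, kp2d_cnn, kp3d_cnn, cam, steps=50, lr=0.01):
    """theta: initial body-model parameters from the part-based regressor."""
    theta = theta.clone().requires_grad_(True)
    opt = torch.optim.Adam([theta], lr=lr)
    for _ in range(steps):
        joints3d = body_model(theta)                   # hypothetical: 3D joints of the model
        loss = ((project(joints3d, cam) - kp2d_cnn) ** 2).mean() \
             + ((joints3d - kp3d_cnn) ** 2).mean()     # 2D reprojection + 3D agreement
        opt.zero_grad(); loss.backward(); opt.step()
    return theta.detach()
```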

    Softmesh: Learning Probabilistic Mesh Connectivity via Image Supervision

    In this work we introduce Softmesh, a fully differentiable pipeline to transform a 3D point cloud into a probabilistic mesh representation that allows us to directly render 2D images. We use this pipeline to learn point connectivity from only 2D rendering supervision, reducing the supervision requirements for mesh-based representations. We evaluate our approach in a set of rendering tasks, including silhouette, normal, and depth rendering, on both rigid and non-rigid objects. We introduce transfer learning approaches to handle the diversity of the task requirements, and also explore the potential of learning across categories. We demonstrate that Softmesh achieves competitive performance even against methods trained with full mesh supervision.
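
    One way to picture probabilistic connectivity driving silhouette rendering: each candidate face carries a learned probability, and the rendered silhouette is the expectation over which faces exist. A minimal sketch under that assumption, with a hypothetical soft rasterizer `face_coverage`; the actual Softmesh pipeline differs in its details.

```python
# Expected silhouette of a mesh whose faces exist with learned probabilities.
import torch

def expected_silhouette(verts2d, faces, face_probs, H, W):
    """verts2d: projected points; faces: (F, 3) candidate triangles;
    face_probs: (F,) connectivity probabilities to be learned."""
    miss = torch.ones(H, W)                      # prob. that no face covers a pixel
    for f, p in zip(faces, face_probs):
        cov = face_coverage(verts2d[f], H, W)    # hypothetical: soft coverage in [0, 1]
        miss = miss * (1 - p * cov)              # treat faces as independent occluders
    return 1 - miss                              # differentiable in the probabilities
```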

    Deep Spatio-Temporal Random Fields for Efficient Video Segmentation.

    In this work we introduce a time- and memory-efficient method for structured prediction that couples neuron decisions across both space and time. We show that we are able to perform exact and efficient inference on a densely-connected spatio-temporal graph by capitalizing on recent advances in deep Gaussian Conditional Random Fields (GCRFs). Our method, called VideoGCRF, (a) is efficient, (b) has a unique global minimum, and (c) can be trained end-to-end alongside contemporary deep networks for video understanding. We experiment with multiple connectivity patterns in the temporal domain, and present empirical improvements over strong baselines on the tasks of both semantic and instance segmentation of videos. Our implementation is based on the Caffe2 framework and will be available at https://github.com/siddharthachandra/gcrf-v3.0
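
    The unique global minimum follows from the quadratic form of the GCRF energy: inference amounts to solving a positive-definite linear system, for example by conjugate gradients, where every step is a differentiable tensor operation. A minimal sketch under these assumptions; names and shapes are illustrative, not the paper's code.

```python
# Exact GCRF inference: minimize 0.5*x'(lam*I + A)x - b'x by conjugate gradients.
import torch

def gcrf_infer(A_mv, b, lam=1.0, iters=50):
    """A_mv(x) applies the (spatio-temporal) pairwise term A; b holds the unaries."""
    x = torch.zeros_like(b)
    r = b - (lam * x + A_mv(x))          # residual of (lam*I + A)x = b
    p = r.clone()
    rs = (r * r).sum()
    for _ in range(iters):
        Ap = lam * p + A_mv(p)
        alpha = rs / (p * Ap).sum()
        x = x + alpha * p                # descend along conjugate direction
        r = r - alpha * Ap
        rs_new = (r * r).sum()
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x                             # converges to the unique global minimum
```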

    UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory

    In this work we train, in an end-to-end manner, a convolutional neural network (CNN) that jointly handles low-, mid-, and high-level vision tasks in a unified architecture. Such a network can act like a Swiss Army knife for vision tasks; we call it an UberNet to indicate its overarching nature. The main contribution of this work consists in handling the challenges that emerge when scaling up to many tasks. We introduce techniques that facilitate (i) training a deep architecture while relying on diverse training sets and (ii) training many (potentially unlimited) tasks with a limited memory budget. This allows us to train end-to-end a unified CNN architecture that jointly handles (a) boundary detection, (b) normal estimation, (c) saliency estimation, (d) semantic segmentation, (e) human part segmentation, (f) semantic boundary detection, and (g) region proposal generation and object detection. We obtain competitive performance while jointly addressing all tasks in 0.7 seconds on a GPU. Our system will be made publicly available.
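
    The core difficulty with diverse training sets is that each image carries labels for only some of the tasks. A common way to realise this, sketched below with illustrative module and task names (the paper's memory-saving machinery is not reproduced), is to back-propagate only the losses of tasks annotated in the current sample.

```python
# Shared trunk with per-task heads; losses are masked by label availability.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, trunk, heads):                 # heads: dict task -> nn.Module
        super().__init__()
        self.trunk, self.heads = trunk, nn.ModuleDict(heads)

    def forward(self, x):
        feats = self.trunk(x)                         # shared low/mid/high-level features
        return {t: head(feats) for t, head in self.heads.items()}

def step(net, opt, images, labels, criteria):
    """labels: dict task -> target tensor, or None when the dataset lacks that task."""
    preds = net(images)
    losses = [criteria[t](preds[t], y)                # only annotated tasks contribute
              for t, y in labels.items() if y is not None]
    loss = torch.stack(losses).sum()                  # assumes >= 1 annotated task
    opt.zero_grad(); loss.backward(); opt.step()
```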

    Deep Filter Banks for Texture Recognition, Description, and Segmentation

    Visual textures have played a key role in image understanding because they convey important semantics of images, and because texture representations that pool local image descriptors in an orderless manner have had a tremendous impact in diverse applications. In this paper we make several contributions to texture understanding. First, instead of focusing on texture instance and material category recognition, we propose a human-interpretable vocabulary of texture attributes to describe common texture patterns, complemented by a new describable texture dataset for benchmarking. Second, we look at the problem of recognizing materials and texture attributes in realistic imaging conditions, including when textures appear in clutter, developing corresponding benchmarks on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic texture representations, including bag-of-visual-words and Fisher vectors, in the context of deep learning and show that these have excellent efficiency and generalization properties if the convolutional layers of a deep model are used as filter banks. We obtain in this manner state-of-the-art performance on numerous datasets well beyond textures, an efficient method to apply deep features to image regions, as well as benefits when transferring features from one domain to another.
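
    The "convolutional layers as filter banks" idea amounts to treating the activations of a pretrained network's conv maps as local descriptors and pooling them orderlessly, e.g. with a Fisher vector. A minimal sketch of a standard diagonal-covariance Fisher vector encoder follows; it is not the authors' exact configuration.

```python
# Fisher vector encoding of local descriptors (e.g. conv-map activations).
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(X, gmm):
    """X: (N, D) local descriptors; gmm: fitted diagonal-covariance GMM."""
    N = X.shape[0]
    q = gmm.predict_proba(X)                                    # (N, K) posteriors
    pi, mu, var = gmm.weights_, gmm.means_, gmm.covariances_
    diff = (X[:, None, :] - mu[None]) / np.sqrt(var)[None]      # whitened residuals
    g_mu = (q[..., None] * diff).sum(0) / (N * np.sqrt(pi)[:, None])
    g_var = (q[..., None] * (diff ** 2 - 1)).sum(0) / (N * np.sqrt(2 * pi)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_var.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                      # power normalisation
    return fv / (np.linalg.norm(fv) + 1e-12)                    # L2 normalisation

# Usage: fit the GMM on descriptors pooled from training images, then encode, e.g.
# gmm = GaussianMixture(64, covariance_type='diag').fit(train_descriptors)
```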